With the rapid proliferation of information across digital media, the assessment of information reliability has become increasingly critical for individuals and societies. Although the concept of deepfakes is not entirely novel, their widespread prevalence has emerged as a pressing concern. The influence of deepfakes and disinformation extends from provoking individuals to shaping public opinion, misleading communities, and potentially impacting national stability. A variety of online techniques exists for both the detection and generation of deepfakes. This study, through a systematic literature review, examines automated detection and generation approaches, encompassing methods, frameworks, algorithms, and tools across multiple modalities (audio, image, and video). Furthermore, it explores the applicability of these approaches in diverse contexts to effectively counter the dissemination of deepfakes and the propagation of disinformation. In this paper, a system is developed that uses a Convolutional Neural Network (CNN) with EfficientNet and LSTM architectures to extract features from each frame. The Multi-Task Cascaded Convolutional Neural Network (MTCNN) algorithm is used for face detection and alignment in images; it employs a series of cascaded convolutional networks to efficiently locate faces and facial landmarks. Experimental analysis is performed using the DFDC (Deepfake Detection Challenge) dataset from Kaggle. The deep learning-based methods are optimized to increase accuracy and decrease training time by using this dataset for training and testing. This study concludes with policy recommendations derived from an analysis of advanced artificial intelligence (AI) methods for deepfake detection and generation in digital media. The review consolidates recent progress in detection and generation frameworks, providing a comprehensive overview of current capabilities.
Furthermore, it highlights the potential of AI-driven approaches to improve detection accuracy and support regulatory measures aimed at mitigating the societal risks of deepfakes.
Introduction
Deepfakes are AI-generated media (video, audio, and images) created using deep learning algorithms to produce realistic but false content. Since 2019, their spread has increased, especially during the COVID-19 pandemic. Public figures are common targets, and deepfakes pose serious risks such as:
Political manipulation and election interference
Financial market destabilization
Religious incitement
Military deception
Deep Learning for Deepfake Detection
Deep learning (DL), using artificial neural networks, is central to detecting deepfakes. Key architectures include:
1. Convolutional Neural Networks (CNNs)
Automatically extract features from images using kernels.
Common in image classification and fake image detection.
2. Recurrent Neural Networks (RNNs)
Used for analyzing temporal inconsistencies in video frames.
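As a minimal illustration of the kernel-based feature extraction described above, the sketch below applies a single hand-crafted convolution kernel to a toy image; the kernel values and input are illustrative only, not taken from the paper's model, which learns its kernels during training.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation: slide the kernel over the image
    and sum elementwise products at each position."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge kernel responds strongly where intensity changes
# left-to-right -- the kind of low-level feature a CNN's first
# layers learn automatically.
image = np.array([
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
], dtype=float)
edge_kernel = np.array([
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
], dtype=float)

feature_map = conv2d(image, edge_kernel)
print(feature_map)
```

Every position of the output responds strongly because the edge between the dark and bright halves falls inside each kernel window; in a trained CNN, many such kernels run in parallel to build a bank of feature maps.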
Dataset Used
DFDC (Deepfake Detection Challenge Dataset): 100,000 videos, balanced between real and fake after frame extraction.
Data Augmentation: Images were flipped to increase training data.
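The flip-based augmentation described above can be sketched as follows; this is a toy example (a 2x3 array standing in for a frame), whereas the actual pipeline would operate on faces extracted from video frames.

```python
import numpy as np

def augment_with_flips(frames):
    """Double a dataset of frames by adding a horizontally
    mirrored copy of each one."""
    flipped = [np.fliplr(f) for f in frames]
    return frames + flipped

# Toy "frame": flipping reverses each row left-to-right.
frame = np.array([[1, 2, 3],
                  [4, 5, 6]])
augmented = augment_with_flips([frame])
print(len(augmented))   # → 2
print(augmented[1])     # mirrored copy
```

Horizontal flipping is a safe augmentation for faces because a mirrored face remains a plausible face, so the label (real or fake) is preserved.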
Face Detection & Recognition – MTCNN
MTCNN is used for accurate and efficient face detection and recognition.
Process involves P-Net, R-Net, and O-Net to refine face coordinates.
Integrated with FaceNet for face alignment and training.
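Each stage of the MTCNN cascade (P-Net, R-Net, O-Net) prunes overlapping candidate boxes with non-maximum suppression (NMS) before passing the survivors to the next stage. A minimal standalone sketch of that pruning step is below; the box coordinates and scores are made up for illustration.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, threshold=0.5):
    """Keep the highest-scoring box, drop the others that overlap it
    by more than `threshold` IoU, and repeat on what remains."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= threshold]
    return keep

# Two heavily overlapping face candidates and one distinct one:
# the weaker duplicate (index 1) is suppressed.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # → [0, 2]
```

In the full cascade, P-Net produces many coarse candidates, and NMS at each stage keeps the refinement workload of R-Net and O-Net tractable.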
Research Motivation & Contributions
Most existing research either lacks depth or omits policy implications. This work aims to:
Analyze both deepfake generation and detection.
Cover audio-visual and policy aspects.
Propose AI-based frameworks to mitigate disinformation.
Key Contributions
Systematic review of deepfake techniques (audio, image, video).
Identification of research gaps.
Consolidation of existing tools and algorithms.
Policy recommendations and future trend analysis.
Insights into real-world application challenges and limitations.
Background: Fake News Detection Techniques
Multiple strategies and models have been developed, including:
Multilingual models using summarization.
Hybrid CNN-LSTM models with explainability (XAI).
Digital watermarking for tamper detection.
Emotion-aware multitask models for nuanced content assessment.
Literature Survey Highlights
1. Heidari et al. (2023) – CNN-Based Image Forgery Detection
Uses image recompression differences to identify manipulations.
Lightweight, real-time capable; 92.23% accuracy.
2. Li et al. (2020) – FF-LBPH with DBN Classifier
High accuracy (up to 98.82%) for facial forgery detection.
Combines feature extraction with deep belief networks.
3. Yogesh et al. (2023) – Dense CNN Architecture
Achieves 97.2% accuracy on selected datasets.
Limitations include overfitting, modality restriction, and poor generalization.
4. El-Gayar et al. (2024) – Graph Neural Networks (GNNs)
Captures relationships between facial features.
Challenges include graph construction, adversarial attacks, and scalability.
5. Hanqing et al. (2021) – Multi-attentional Networks
Fine-grained classification using attention over facial regions.
Struggles with generalization, explainability, and low-resolution content.
Conclusion
This paper presents a neural network-based method for distinguishing between deepfake and real videos. The high prediction confidence of the output indicates that the project was completed successfully. Following an in-depth analysis of previously proposed algorithms, the project design was developed, and the theoretical background of all the resources and technologies used was explained in detail to clarify their application and rationale.

Facial manipulation in videos is becoming an increasingly widespread issue. This study carefully reviews the relevant literature to gain a deeper understanding of the problem and proposes a network architecture that effectively detects such manipulations using five convolutional neural networks, all while keeping computational cost low. In this method, a CNN face detector extracts facial regions from video frames; the distinct spatial features of these faces are captured using CNNs with ReLU activations, aiding the detection of visual artifacts in the frames. Under typical internet conditions, the proposed technique achieves an average detection rate of 98% for deepfake videos and 95% for Face2Face videos, according to the study's results.

The study has demonstrated that CNN performance can be enhanced by adding more convolutional layers and tuning specific parameters. It also takes compression factors into account, which present a significant challenge for many deepfake detection systems. Future algorithms are expected to concentrate on these issues while utilizing updated datasets. Although this research focused on identifying deepfakes in still images and videos, we believe this method could be extended to detect deepfakes in audio and text, aiding the fight against misinformation in the digital age.
These areas will be explored in future investigations.